Query Translation using Wikipedia-based resources for analysis and disambiguation
Abstract
This work investigates query translation using only Wikipedia-based resources in a two-step approach: analysis and disambiguation. After arguing that data mined from Wikipedia is particularly relevant to query translation, from both a lexical and a semantic perspective, we detail the implementation of the approach. In the analysis phase, lexical units are extracted from queries and associated with several possible translations using a Wikipedia-based bilingual dictionary. In the disambiguation phase, one translation is chosen among the candidates based on topic homogeneity, assessed with the help of the semantic information carried by the categories of Wikipedia articles. We report promising results regarding translation accuracy.
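The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary entries, the category sets, the greedy longest-match segmentation, and the scoring by pairwise shared categories are all assumptions made for illustration, and every function name (`analyze`, `homogeneity`, `disambiguate`, `translate`) is hypothetical.

```python
from itertools import product

# Hypothetical toy bilingual dictionary (English -> French article titles),
# standing in for one mined from Wikipedia inter-language links.
DICTIONARY = {
    "java": ["Java (île)", "Java (langage)"],
    "python": ["Python (serpent)", "Python (langage)"],
}

# Hypothetical categories attached to each target-language article.
CATEGORIES = {
    "Java (île)": {"Île", "Indonésie"},
    "Java (langage)": {"Langage de programmation", "Informatique"},
    "Python (serpent)": {"Serpent", "Reptile"},
    "Python (langage)": {"Langage de programmation", "Informatique"},
}


def analyze(query, dictionary):
    """Analysis phase: split the query into lexical units (longest match
    first) and attach every candidate translation found in the dictionary."""
    tokens = query.lower().split()
    units = []
    i = 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in dictionary:
                units.append((phrase, list(dictionary[phrase])))
                i = j
                break
        else:
            # Unknown token: keep it untranslated as its only candidate.
            units.append((tokens[i], [tokens[i]]))
            i += 1
    return units


def homogeneity(candidates, categories):
    """Assumed topic-homogeneity score: number of categories shared by
    each pair of candidate translations."""
    score = 0
    for a in range(len(candidates)):
        for b in range(a + 1, len(candidates)):
            cats_a = categories.get(candidates[a], set())
            cats_b = categories.get(candidates[b], set())
            score += len(cats_a & cats_b)
    return score


def disambiguate(units, categories):
    """Disambiguation phase: pick the combination of candidates (one per
    lexical unit) with the highest homogeneity score."""
    options = [candidates for _, candidates in units]
    best = max(product(*options), key=lambda combo: homogeneity(combo, categories))
    return list(best)


def translate(query):
    return disambiguate(analyze(query, DICTIONARY), CATEGORIES)


print(translate("java python"))
```

The exhaustive search over `product(*options)` is only practical for short queries with few candidates per unit; a longer query would need a greedy or pruned search instead.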
Similar resources
Cross-Language Retrieval with Wikipedia
We demonstrate a twofold use of Wikipedia for cross-lingual information retrieval. As our main contribution, we exploit Wikipedia hyperlinkage for query term disambiguation. We also use bilingual Wikipedia articles for dictionary extension. Our method is based on translation disambiguation; we combine the Wikipedia based technique with a method based on bigram statistics of pairs formed by tran...
A New Method for Applicant of Explicit Semantic Analysis and Word Sense Disambiguation in Concept-based Information Retrieval
Previous information retrieval (IR) systems relied on keywords to index and retrieve documents. They may return inaccurate results when different keywords are used to describe the same concept in the documents and in the queries presented by users. Concept-based retrieval methods have tried to tackle these problems by using concept-based comparison between documents and queries. Therefo...
Japanese-Spanish Thesaurus Construction Using English as a Pivot
We present the results of research with the goal of automatically creating a multilingual thesaurus based on the freely available resources of Wikipedia and WordNet. Our goal is to increase resources for natural language processing tasks such as machine translation targeting the Japanese-Spanish language pair. Given the scarcity of resources, we use existing English resources as a pivot for cre...
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
Japanese-Chinese Information Retrieval With an Iterative Weighting Scheme
This paper describes our Japanese-Chinese cross-language information retrieval system. We adopt a “query-translation” approach and employ both a conventional Japanese-Chinese bilingual dictionary and Wikipedia to translate query terms. We propose that Wikipedia can be regarded as a good dictionary for named entity translation. According to the nature of the Japanese writing system, we propose that que...